Diagnostic Accuracy of Dried Plasma Spot Specimens for HIV-1 Viral Load Testing: A Systematic Review and Meta-analysis

Supplemental Digital Content is Available in the Text.


INTRODUCTION
Of the nearly 38 million people living with HIV, approximately 24.5 million had access to antiretroviral therapy in 2019. 1 Monitoring treatment is critical to ensure people on antiretroviral therapy are on the most effective regimen. Furthermore, achieving viral suppression reduces the risk of onward transmission. 2 Global targets now exist to evaluate the effectiveness of identifying and treating people living with HIV. The last 90 of UNAIDS' 90-90-90 targets measures the proportion of people on antiretroviral therapy who are virally suppressed. 3 Increasing access to viral load testing is essential to support high-quality individual treatment monitoring and to understand individual and overall population suppression rates to minimize transmissions.
The 2016 World Health Organization (WHO) consolidated guidelines on the use of antiretroviral drugs for treating and preventing HIV infection recommend viral load as the preferred monitoring approach to diagnose and confirm treatment failure, with plasma specimens as the preferred specimen type. 4 Although viral load testing has scaled up considerably in low-income and middle-income countries, 5 several challenges remain. In particular, the use of traditional liquid plasma can be difficult for some countries or settings because of strict specimen storage and transport times and temperatures. Most manufacturers of currently approved viral load assays require plasma separation from whole blood within 24 hours of specimen collection. 6 These requirements, therefore, limit the breadth and scope of viral load testing programs. An analysis across 4 sub-Saharan African countries found that only approximately 44% of health care facilities and 52% of people on antiretroviral therapy can access viral load testing using plasma specimens under those conditions. Alternative specimen types and technologies will be critical to support expansion of viral load testing to all in need as national infrastructural projects further develop to allow for improved and expedited transport.
Dried plasma spot cards are an alternative specimen type that requires the application of liquid plasma to a filter paper card. They are similar to dried blood spot cards and specimens, except that plasma rather than whole blood is applied directly to the card. Together with dried blood spot specimens, dried plasma spot specimens may be able to support wider decentralization and access to viral load testing; however, they typically require a centrifuge to separate the plasma from whole blood before application. Dried plasma spot specimens do not require cold chain, can be stored for longer periods of time once prepared, and are safer to transport because they are generally no longer infectious. In addition, they can be prepared by lower cadres of health care facility staff, similar to dried blood spot specimens and point-of-care technologies, further allowing decentralization and task-shifting. [7][8][9][10] Several diagnostic accuracy studies have been published highlighting the performance of dried plasma spot specimens compared with that of traditional liquid plasma for HIV-1 viral load testing in people living with HIV. Given the significant interest and effort in scaling up viral load testing in resource-limited settings, it was timely to collate and summarize the findings through a systematic review and meta-analysis.

Search Strategy
A systematic review and meta-analysis was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses following a predefined study protocol 34 (see Table 1, Supplemental Digital Content, http://links.lww.com/QAI/B756). A search was conducted in January 2019 using PubMed and Medline databases to identify peer-reviewed original research with appropriate data for this systematic review and meta-analysis (search terms given in Supplemental Digital Content 1, http://links.lww.com/QAI/B755). Conference abstracts from the Conference on Retroviruses and Opportunistic Infections, International Conference on AIDS and sexually transmitted infections in Africa, International AIDS Society, and AIDS Conferences and extensive bibliography and gray literature were screened for possible inclusion. No restrictions on publication year, publication status, or language were used.
For inclusion, studies must have compared viral load values using dried plasma spot specimens with the reference standard of liquid plasma specimens and measured by 1 or more of the following 7 commonly used technologies: Abbott RealTime HIV-1 on the m2000 platform (Abbott Molecular Inc, Abbott Park, IL), Generic HIV Charge Virale (Biocentric, Bandol, France), bioMérieux NucliSENS EasyQ HIV-1 v2.0 (bioMérieux, Craponne, France), Cavidi ExaVir Load (Cavidi, Uppsala, Sweden), Hologic Aptima (Hologic, Marlborough, MA), Roche Amplicor HIV-1 Monitor Test, v1.5 or COBAS AmpliPrep/COBAS TaqMan HIV-1 Test, v2.0 (Roche Molecular Systems Inc, Basel, Switzerland), and/or Siemens VERSANT HIV-1 RNA 1.0 assay (kPCR) (Siemens Healthcare Diagnostics, Munich, Germany). No studies were found that used the Siemens VERSANT HIV-1 RNA 1.0 assay. Studies were not included if the index assay used was an in-house developed assay that lacks any international regulatory approval and/or cannot be procured traditionally by other countries or laboratories.

Study Selection and Systematic Review
Two reviewers independently screened all titles and abstracts for inclusion and reviewed all potentially relevant studies in full. Studies were included if they evaluated the accuracy of dried plasma spot specimens compared with that of traditional liquid plasma, pertained to viral load testing, and were performed using plasma prepared from blood samples of HIV-positive patients. Studies were excluded if they used spiked or pooled blood specimens or panels, they compared dried plasma spot specimens with plasma using a different assay, they performed a qualitative analysis of dried plasma spot specimens, or the comparator was a sample type other than liquid plasma.
Data were extracted and summarized for all included studies, outlining the study design, methods, and principal components of each study (eg, sample size, viral load assay used, and storage and transport conditions of specimens). Study characteristics were extracted from each manuscript or through author contact, when necessary. The primary outcome assessed was accuracy of the dried plasma spot specimens compared with that of plasma. Forest plots of the log bias and r-squared variables were developed to analyze the between-study heterogeneity of diagnostic performance.
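The two quantitative metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the review: the function name `log_bias_and_r2` is hypothetical, the bias is assumed to be the mean of (log10 dried plasma spot minus log10 plasma), and only pairs with a quantifiable result on both specimen types are assumed to be passed in.

```python
import math
from typing import Sequence, Tuple


def log_bias_and_r2(dps_copies: Sequence[float],
                    plasma_copies: Sequence[float]) -> Tuple[float, float]:
    """Mean log10 bias (DPS minus plasma) and r-squared for paired results.

    Assumption: every value is a detectable viral load in copies/mL (> 0).
    """
    if len(dps_copies) != len(plasma_copies) or len(dps_copies) < 2:
        raise ValueError("need at least two paired results")

    x = [math.log10(v) for v in plasma_copies]  # reference standard
    y = [math.log10(v) for v in dps_copies]     # index specimen type
    n = len(x)

    # Mean difference on the log10 scale (the centre line of a Bland-Altman plot).
    bias = sum(yi - xi for xi, yi in zip(x, y)) / n

    # Squared Pearson correlation, which equals r-squared of a simple linear fit.
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    r2 = (sxy ** 2) / (sxx * syy)
    return bias, r2
```

Under this sign convention a negative bias would mean that the dried plasma spot result tends to be lower than the paired plasma result.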
Twenty-three studies were identified through the search strategy (Fig. 1). We contacted the corresponding authors of all studies that met the inclusion criteria at least twice to explain the analysis plan and request primary data. For the meta-analysis, a total of 16 studies provided 18 data sets across 6 technologies resulting in a total of 1847 paired dried plasma spot and plasma viral load results. Data from the remaining 7 studies were not included in the meta-analysis because the study authors did not respond to the request to share.

Quality Assessment
The Standards for Reporting Studies of Diagnostic Accuracy criteria and Quality Assessment of Diagnostic Accuracy Studies-2 were followed and each study graded for quality. 35,36

Statistical Analyses: Meta-Analysis
Study variables reviewed for each study included study sample size, viral load mean and median, proportion of patient specimens within specific viral load ranges, and sensitivity and specificity. Sensitivity and specificity were previously defined. 9 In brief, sensitivity was calculated as the proportion of dried plasma spot specimens correctly identified as failing or above the defined virological failure threshold compared with that of plasma. Specificity was calculated as the proportion of dried plasma spot specimens correctly identified as not failing or below the virological failure threshold compared with that of plasma. Primary data were then pooled to analyze the performance of dried plasma spot specimens for each technology. Viral load values were log-transformed because of the nonnormal distribution of the data. Because longitudinal data on dried plasma spot specimen performance were not available, cross-sectional comparisons were performed. In addition, lower treatment failure thresholds for viral load using dried plasma spot specimens were assessed including detectable (defined as any detectable result indicating treatment failure), 200, 400, 500, 600, 800, and 1000 copies/mL. Performance of dried plasma spot specimens was compared with that of plasma across each treatment failure threshold with measurements of true positives, true negatives, false positives, and false negatives calculated for each technology to create estimates of diagnostic accuracy of dried plasma spot specimens overall and for each platform across all studies. Using these treatment failure thresholds, the sensitivity, specificity, upward and downward misclassification rates, and positive and negative predictive values were also calculated.
Upward misclassification was defined as the number of dried plasma spot specimens incorrectly identified as above the tested treatment failure threshold divided by the total number of matched plasma specimens with viral load results less than 1000 copies/mL. Downward misclassification was defined as the number of dried plasma spot specimens incorrectly identified as below the tested treatment failure threshold divided by the total number of matched plasma specimens with viral load results more than 1000 copies/mL.
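The definitions above can be illustrated with a short Python sketch that builds a 2x2 table from paired results at one treatment failure threshold. It is an illustration under stated assumptions rather than the authors' code: the names `Accuracy`, `is_failing`, and `classify_pairs` are hypothetical, an undetectable result is represented as `None`, a result at or above the threshold is treated as failing, and the same threshold is applied to both specimen types (which matches the text at 1000 copies/mL).

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical representation: copies/mL, or None for an undetectable result.
ViralLoad = Optional[float]


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


@dataclass
class Accuracy:
    tp: int  # failing by plasma and by dried plasma spot
    fp: int  # not failing by plasma, failing by dried plasma spot
    tn: int  # not failing by either specimen type
    fn: int  # failing by plasma, not failing by dried plasma spot

    @property
    def sensitivity(self) -> float:
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return _ratio(self.tn, self.tn + self.fp)

    @property
    def upward_misclassification(self) -> float:
        return _ratio(self.fp, self.fp + self.tn)

    @property
    def downward_misclassification(self) -> float:
        return _ratio(self.fn, self.fn + self.tp)


def is_failing(value: ViralLoad, threshold: Optional[float]) -> bool:
    """threshold=None stands for the 'any detectable result' threshold."""
    if value is None:
        return False
    return True if threshold is None else value >= threshold


def classify_pairs(pairs: Iterable[Tuple[ViralLoad, ViralLoad]],
                   threshold: Optional[float]) -> Accuracy:
    """Tally (dried plasma spot, plasma) pairs against the plasma reference."""
    acc = Accuracy(tp=0, fp=0, tn=0, fn=0)
    for dps, plasma in pairs:
        index_fail = is_failing(dps, threshold)
        reference_fail = is_failing(plasma, threshold)
        if reference_fail and index_fail:
            acc.tp += 1
        elif reference_fail:
            acc.fn += 1
        elif index_fail:
            acc.fp += 1
        else:
            acc.tn += 1
    return acc
```

Calling `classify_pairs(pairs, t)` for each `t` in `(None, 200, 400, 500, 600, 800, 1000)` would reproduce the kind of threshold sweep described above. With a single shared threshold, upward misclassification is simply 1 minus specificity and downward misclassification is 1 minus sensitivity.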
Random effects models were used to estimate the summary measures for accuracy accounting for between-study variation. For sensitivity and specificity values and corresponding 95% confidence intervals (CIs), bivariate random effects models designed to estimate summary sensitivity and specificity were used to simultaneously determine the estimates, accounting for the covariance of sensitivity and specificity and study-specific heterogeneity. 37 To obtain estimates of misclassification, univariate random effects models were used to obtain the point estimates and corresponding 95% CIs. [38][39][40] Graphic representations were completed in GraphPad Prism (La Jolla, CA), and analyses were completed in R version 3.4.3 (The R Foundation).
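To make the idea of a univariate random effects estimate concrete, the following Python sketch pools a proportion (for example, a misclassification rate) across studies. It is a simplified stand-in and not the analysis performed in R for this review: it uses the DerSimonian-Laird estimator on logit-transformed proportions, which is an assumption, and it does not implement the bivariate model used for sensitivity and specificity. The function names and the 0.5 continuity correction are likewise illustrative choices.

```python
import math
from typing import Sequence, Tuple


def logit_with_variance(events: int, total: int) -> Tuple[float, float]:
    """Logit of a proportion and its approximate variance.

    Assumption: a 0.5 continuity correction is applied when a cell is zero.
    """
    a, b = float(events), float(total - events)
    if a == 0 or b == 0:
        a, b = a + 0.5, b + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def pool_random_effects(events: Sequence[int],
                        totals: Sequence[int]) -> Tuple[float, float, float]:
    """DerSimonian-Laird pooled proportion with an approximate 95% CI."""
    pairs = [logit_with_variance(e, n) for e, n in zip(events, totals)]
    effects = [y for y, _ in pairs]
    variances = [v for _, v in pairs]
    k = len(effects)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Between-study variance (tau squared), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    def inv_logit(x: float) -> float:
        return 1.0 / (1.0 + math.exp(-x))

    return (inv_logit(pooled),
            inv_logit(pooled - 1.96 * se),
            inv_logit(pooled + 1.96 * se))
```

For example, `pool_random_effects([5, 30], [100, 100])` returns a pooled proportion between the two study proportions, with an interval that widens as the studies disagree more.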

Protocol
The prepared protocol was reviewed by the World Health Organization and approved by Chesapeake Institutional Research Review Board (Columbia, MD; www.chesapeakeirb.com). 9
Twenty-three studies met the inclusion criteria and were published between 1997 and 2017 (Fig. 1 and Table 1). The excluded studies were those that used incorrect specimen types (33) or incorrect assays (10), duplicates (3), review manuscripts (6), or had no comparator included (4).

Quality of Studies
The quality assessment found some risk of bias in patient selection, reference standard, and index test (Supplemental Digital Content Fig. 1, http://links.lww.com/QAI/B754). In most studies, blinding and the timing of testing were unclear, few studies stated how specimens were selected, and inclusion and exclusion criteria were often lacking. Furthermore, study design and patient/specimen demographics were rarely stated or presented. In addition, most studies were conducted before 2011 (15; 65.2%). There was, however, applicability in patient selection, index test, and reference standard in most studies.

Systematic Review Analysis
The median study sample size was 47 specimens. The primary metrics reported in studies were linear regression (r²) and log bias analyses (Fig. 2, Table 2); however, no metric was consistently presented across all studies. Nearly all studies included only quantitative analytics and were published before the WHO recommendations; therefore, none presented data regarding a treatment failure threshold as is currently recommended by the WHO and practiced across most low-income and middle-income countries.
The mean bias was 0.28 log10 copies/mL (dried plasma spot:plasma). All technologies had an r² greater than 0.75, except for the Biocentric Generic HIV Charge Virale technology (r² = 0.4485) (Fig. 4). For all technologies together, the median dried plasma spot viral load was 0.10 log10 copies/mL, whereas the median plasma viral load was 2.35 log10 copies/mL (Table 3). More dried plasma spot values were undetectable compared with plasma values (43.6% vs. 29.8%). There were a total of 560 undetectable plasma viral load results and 820 undetectable dried plasma spot results with 546 paired results being undetectable using both plasma and dried plasma spot. Ten of the 14 false detectable results using the dried plasma spot specimen were more than 1000 copies/mL with a median of 2250 copies/mL. There were 274 results that were detectable by plasma but undetectable by dried plasma spot specimen with a median plasma result of 56 copies/mL; however, only 20 had a plasma viral load result that was ≥1000 copies/mL. One hundred eighty of these 274 results (65.7%) had a plasma result that was less than 100 copies/mL.
The sensitivity and specificity of dried plasma spot specimens remained relatively consistent across all lower thresholds analyzed (Table 3). When considering a treatment failure threshold of any detectable result, the sensitivity and specificity were 97.06% (95% CI: 87.41 to 99.37) and 98.69% (95% CI: 95.03 to 99.66), respectively. The performance of dried plasma spot specimens across treatment failure thresholds also remained consistent compared with the 1000 copies/mL treatment failure threshold when analyzed for each technology. For all technologies together, the total, upward, and downward misclassifications were all less than 8% across each of the 7 treatment failure thresholds analyzed. All the technologies, with the exception of Biocentric Generic HIV Charge Virale (upward), bioMérieux NucliSENS EasyQ HIV-1 (downward), and Hologic Aptima (downward), had total, upward, and downward misclassifications of less than 15%.

DISCUSSION
When dried plasma spot specimens were used for HIV-1 viral load testing, the diagnostic accuracy was relatively comparable with that of traditional liquid plasma specimens. When analyzed across all technologies and treatment failure thresholds, the sensitivity and specificity remained greater than 92%. Furthermore, misclassification rates (total, upward, and downward) were low at less than 8%. These results are better and more consistent than those of a recent meta-analysis looking at the performance of dried blood spot specimens for viral load testing. 9 This is most likely the case because the specimen type in the current meta-analysis was the same (plasma) as the comparator, whereas dried blood spot specimens consist of whole blood and are likely to detect intracellular RNA and proviral DNA in addition to the standard circulating RNA. 9,41 Of interest, dried plasma spots were observed to sometimes have lower viral loads than the traditional liquid plasma specimens. In fact, 20.7% (274 of 1322) of all specimens that were detectable by plasma were undetectable by dried plasma spot specimens; however, only 20 of those plasma specimens (1.5% of 1322) were downward misclassified by the dried plasma spot specimen at the treatment failure threshold of 1000 copies/mL. The false undetectability observed was likely caused by the lower specimen input volume used for dried plasma spot specimens compared with the traditional liquid plasma. Most studies in this systematic review used 1-2 dried plasma spots of 50 µL each, yet 1 mL of plasma for the reference standard. Because of this, the limit of detection of dried plasma spots may be restricted by the smaller input volume and, thus, may not always detect those specimens with very low viral load values.
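The input volume argument can be made concrete with some simple arithmetic, sketched here in Python. This is an idealized illustration only: it reads the spot volume as 50 µL and the reference input as 1 mL, assumes the number of RNA copies in the input follows a Poisson distribution, and ignores elution and extraction losses and each assay's own limit of detection. The function names are hypothetical.

```python
import math


def expected_copies(viral_load_copies_per_ml: float,
                    input_volume_ml: float) -> float:
    """Average number of RNA copies contained in the tested volume."""
    return viral_load_copies_per_ml * input_volume_ml


def prob_no_template(viral_load_copies_per_ml: float,
                     input_volume_ml: float) -> float:
    """Poisson probability that the tested volume contains zero copies."""
    return math.exp(-expected_copies(viral_load_copies_per_ml, input_volume_ml))


# Example at 56 copies/mL, the median plasma result among the pairs that
# were detectable by plasma but undetectable by dried plasma spot.
one_spot = expected_copies(56, 0.05)      # one 50 µL spot
two_spots = expected_copies(56, 0.10)     # two 50 µL spots
liquid_plasma = expected_copies(56, 1.0)  # 1 mL of liquid plasma
```

Under these assumptions a single spot carries fewer than 3 copies on average at 56 copies/mL, compared with 56 copies in 1 mL of plasma, so a spot containing no template at all is plausible at low viral loads and essentially impossible near 1000 copies/mL.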
Although some challenges of false undetectability were observed, dried plasma spot specimens performed well and consistently at lower treatment failure thresholds. In fact, although the CIs were overlapping, the sensitivity and specificity were higher when a detectable treatment failure threshold was used compared with the 1000 copies/mL treatment failure threshold. This consistency should allow programs considering a lower treatment failure threshold to use this alternative specimen type if useful and feasible for their settings.
Programs across most high HIV burden countries still require novel solutions and innovations to improve access to viral load testing. Dried plasma spot specimens represent one potential innovation that may be able to support wider decentralization of viral load testing. One significant drawback to this technology, however, is the requirement for a centrifuge and human resource skills to separate plasma from the original whole blood specimen and spot onto the dried plasma cards. The spotting process, however, does not require traditional calibrated, scientific pipettes and techniques because each dried plasma spot takes a standard volume and the specimen can be applied using disposable plastic droppers or transfer dropper pipettes. The necessity for a centrifuge at the site of specimen collection is a significant challenge that may limit consideration. Furthermore, because most studies were conducted in developed settings, the feasibility in resource-limited settings is unclear, potentially limiting routine adoption. Alternative plasma separation methods would be helpful to allow for uptake and decentralization of this specimen type in settings in need of alternative approaches to access viral load testing.
Several alternative approaches have been developed more recently that try to take advantage of using plasma, yet with simplified preparation techniques that can be more accessible to resource-limited settings. Plasma separation or filtration devices or cards have been developed that allow for application of whole blood directly to the device or card that, with or without further manipulation, results in plasma that can then be used for viral load testing. [42][43][44] Although these technologies may experience similar false undetectability challenges due to the small specimen input volume, the implications are likely to be similarly minor. Furthermore, such novel technologies will remove the requirement for onsite centrifugation and associated skills. However, as with any new specimen type, widespread uptake and decentralization require manufacturers to include alternative specimen types within intended use claims and regulatory submissions.
Most studies included in this systematic review analyzed their data with traditional quantitative measures, such as linear regression and Bland-Altman. Of interest, some studies did not include either metric, and there was poor consistency of the analyzed metrics across studies. Furthermore, no study analyzed their data considering the current application of viral load testing within the WHO recommendations and the treatment failure algorithm. This is likely primarily because most studies were conducted before the 2013 WHO guidelines when the WHO initially recommended viral load testing as the preferred modality to monitor treatment. 47 A meta-analysis on this topic was, therefore, critical to provide a better understanding of the performance of dried plasma spot specimens for viral load testing. Furthermore, key metrics should be considered in all future diagnostic accuracy studies, using this or other sample types for viral load, including linear regression, Bland-Altman, and more qualitative metrics such as sensitivity/specificity and misclassification across a variety of treatment failure thresholds.
This study had several limitations. First, although the overall sample size of the meta-analysis was large and allowed for precise overall conclusions to be made, several technologies had relatively small sample sizes when each of the technologies was analyzed independently. More precise conclusions, therefore, could not be made for the Biocentric Generic HIV Charge Virale, Hologic Aptima, and Roche Amplicor HIV-1 technologies. Additional studies using these and upcoming technologies will allow for more meaningful interpretations. The Roche Amplicor HIV-1 technology has been discontinued and is no longer in use. Second, dried plasma spot preparation techniques varied across studies, particularly in the dried plasma spot card drying and storage time and conditions, the number of spots used, and the preparation protocol. Currently, none of the suppliers have validated dried plasma spots within their instructions for use or WHO prequalification documents; therefore, it is difficult to compare the protocols used in these studies with any standard protocol. Furthermore, there was not always consistency among studies using the same technology. Fortunately, however, the results remained relatively consistent and CIs tight for those that had reasonable sample sizes. In addition, similar issues were observed in a recent meta-analysis reviewing the performance of dried blood spot specimens 9; however, a subanalysis of manufacturer-compliant studies did not perform significantly better. Third, although the studies spanned a number of countries and continents and could be considered generalizable, all studies conducted plasma separation and dried plasma spot specimen preparation in the laboratory from collected venipuncture specimens.
This is unlikely to be the processing protocol if implemented in low-income and middle-income settings to support broader uptake of viral load testing; therefore, additional studies are necessary to understand the performance and feasibility of dried plasma spot specimens in intended use, more decentralized, health care facility settings. Fourth, the studies included in the meta-analysis had a considerable number of detectable specimens (nearly 70%), suggesting that the population included in these studies and/or meta-analysis may not be representative of current programmatic settings that typically have observed suppression rates of >80%. 45,46 The positive and negative predictive values should, therefore, be cautiously interpreted. However, there was a substantial proportion of patients with low level viral loads (23% had a plasma viral load between 20 and 1000 copies/mL), and thus, the overall results remain informative. Fifth, unfortunately, not all authors shared primary data. Although nearly 70% of studies shared primary data for inclusion in the meta-analysis, the missing data could account for some potential bias in the results. Finally, due to the smaller sample sizes and lack of available patient demographic information, we were not able to conduct subanalyses focused on pediatric populations or people living with HIV who were on antiretroviral therapy.
This systematic review and meta-analysis provided strong evidence that dried plasma spot specimens can be used for accurate viral load testing. Manufacturers should consider incorporating this specimen type within official communications and regulatory submissions, whereas country programs and implementing organizations can consider the utility of this specimen type in an effort to further decentralize and expand access to viral load testing.